AUTOMATED REGISTRATION OF 3-D MEDICAL SCANS OF SIMILAR 
ANATOMICAL STRUCTURES 



Cross-Reference to Related Applications 
5 Related applications are: 

"Density Nodule Detection in 3-Dimensional Medical Images," attorney 
docket number 8498-035-999, filed concurrently herewith; 

"Method and System for the Display of Regions of Interest in Medical 

Images," Serial No. , filed November 21, 2001, attorney docket number 

10 8498-039-999; 

"Vessel Segmentation with Nodule Detection," attorney docket number 8498- 
042-999, filed concurrently herewith; 

"Lung Field Segmentation from CT Thoracic Images," attorney docket 
number 8498-044-999, filed concurrently herewith, each of which is incorporated 
15 herein by reference; 

"Pleural Nodule Detection from CT Thoracic Images," attorney docket 
number 8498-045-999, filed concurrently herewith, each of which is incorporated 
herein by reference; and 

"Graphical User Interface for Display of Anatomical Information," Serial 

20 No . , filed November 2 1 , 200 1 , claiming priority from Serial No. 

60/252,743, filed November 22, 2000 and claiming priority to the U.S. Provisional 
Application Serial Number 60/314,582 filed August 24, 2001. 

This application hereby incorporates by reference the entire disclosure, 
drawings and claims of each of the above-referenced applications as though fully set 
25 forth herein. 

Field of the Invention 

This invention relates to image registration in medical diagnostic systems and 
methods for constructing three-dimensional representations from two or three- 
30 dimensional image data sets. The invention includes methods and systems for image 
registration and correlation of a three-dimensional data image volume to another 
three-dimensional data image volume. 
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Background of the Invention 

The diagnostically superior information available from data acquired from 
various imaging systems, especially that provided by multidetector CT (multiple 
slices acquired per single rotation of the gantry) where acquisition speed and 
volumetric resolution provide exquisite diagnostic value, enables the detection of 
potential problems at earlier and more treatable stages. Given the vast quantity of 
detailed data acquirable from imaging systems, various algorithms must be developed 
to efficiently and accurately process image data. With the aid of computers, advances 
in image processing are generally performed on digital or digitized images. 

Digital acquisition systems for creating digital images include digital X-ray 
film radiography, computed tomography ("CT") imaging, magnetic resonance 
imaging ("MRI") and nuclear medicine imaging techniques, such as positron emission 
tomography ("PET") and single photon emission computed tomography ("SPECT"). 
Digital images can also be created from analog images by, for example, scanning 
analog images, such as typical x-rays, into a digitized form. Further information 
concerning digital acquisition systems is found in our above-referenced copending 
application "Graphical User Interface for Display of Anatomical Information". 

Digital images are created from an array of numerical values representing a 
property (such as a grey scale value or magnetic field strength) associable with an 
anatomical location referenced by a particular array location. In 2-D digital images, 
or slice sections, the discrete array locations are termed pixels. Three-dimensional 
digital images can be constructed from stacked slice sections through various 
construction techniques known in the art. The 3-D images are made up of discrete 
volume elements, also referred to as voxels, composed of pixels from the 2-D images. 
The pixel or voxel properties can be processed to ascertain various properties about 
the anatomy of a patient associated with such pixels or voxels. 

Once in a digital or digitized format, various analytical approaches can be 
applied to process digital anatomical images and to detect, identify, display and 
highlight regions of interest (ROI). For example, digitized images can be processed 
through various techniques, such as segmentation. Segmentation generally involves 
separating irrelevant objects (for example, the background from the foreground) or 
extracting anatomical surfaces, structures, or regions of interest from images for the 
purposes of anatomical identification, diagnosis, evaluation, and volumetric 
measurements. Segmentation often involves classifying and processing, on a per- 
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pixel basis, pixels of image data on the basis of one or more characteristics associable 
with a pixel value. For example, a pixel or voxel may be examined to determine 
whether it is a local maxima or mimima based on the intensities of adjacent pixels or 
voxels. 

5 Once anatomical regions and structures are constructed and evaluated by 

analyzing pixels and/or voxels, subsequent processing and analysis exploiting 
regional characteristics and features can be applied to relevant areas, thus improving 
both accuracy and efficiency of the imaging system. For example, the segmentation 
of an image into distinct anatomical regions and structures provides perspectives on 

10 the spatial relationships between such regions. Segmentation also serves as an 

essential first stage of other tasks such as visualization and registration for temporal 
and cross-patient comparisons. 

Image registration is a process of alignment of medical imaging data for 
faciUtating comparisons and medical diagnosis. Image registrations of digital images 

15 allow doctors to visuahze and monitor physiological changes in a patient over time or 
to keep track of the growth or decline of lesions. For example, image registration 
enables doctors to identify, compare and determine the growth of a malignant lesion 
or nodule. A comprehensive survey of existing medical image registration is given in 
"Medical Imaging Matching- A Review With Classification", van den Elsen, Pol, and 

20 Viergever, IEEE Engineering in Medicine and Biology, Vol. 12, No. 1, pp. 26-39, 
1993. This reference is hereby incorporated for all purposes. 

Anatomical registration is a difficult problem because anatomical structures 
vary widely in appearance and location within different patients. Even within the 
same patient, the passage of time often brings about great variations. Variations with 

25 respect to the patient and patient anatomy are generally referred to as intrinsic 
variations. Intrinsic, or local, variations refer to structural differences between 
anatomical structures. For example, intrinsic variations result when image acquisition 
is performed out of phase (i.e., taking the image during a different breath-phase or 
heart-beat), but are also unavoidable due to differences between patients and changes 

30 in patients over time. 

The task of mapping an image of a particular anatomical structure to another 
image is difficult also due to variations between images. For example, variations can 
occur due to differences in image acquisition environments, differences between 
patients (for a cross-patient comparison), and/or corporeal changes over time (for a 
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temporal comparison). Variations due to factors external to a patient anatomy are 
generally referred to as extrinsic variations. Certain extrinsic, or global, variations 
result from the lack of standard protocols or orientations for image acquisitions. For 
example, even if attempts are made to take images along a principal axis, the actual 
axis along which cross-section images are obtained may be at an angle to the principal 
axis. Additionally, the use of different fields-of-view, dose, and varying patient size 
all contribute to extrinsic variations. 

Most existing medical image registration methods require user interaction 
and/or are computationally expensive or intensive. One method commonly used for 
medical image registration is based on the mutual information approach. The mutual 
information approach is based upon the maximization of mutual information criteria 
used to select attributes conditioned on prior registrations. Having its origin in 
information theory, mutual information criteria generally include a numerical value 
that indicates how well each attribute discriminates a chosen label attribute, i.e., the 
relatedness of one random variable to another is based upon a measure of the 
variables' entropies. One example can include two-dimensional gray-scale 
histograms of image pairs that are used for registration. One drawback of the mutual 
information approach is that it generally does not account for different types of 
variations (i.e., extrinsic and intrinsic variations) between images. Another drawback 
of the mutual information approach is that entire data sets of the image pairs need to 
be analyzed. Thus, it is very time consuming and inefficient. 

Correlation-based approaches involving cross-correlation or cross-covariance 
of image pixel intensities provide methods of image registration. However, such 
approaches are generally not accurate for image content and noise that exhibit large 
variations and differences. 

Another existing method, the atlas model approach, requires the use of a pre- 
determined "atlas model" that characterizes an anatomical structure being registered. 
In this approach, generic anatomical structures are used as a pattern for the structure 
being registered. However, since medical scan procedures do not always scan an 
anatomical structure in its entirety, or in a particular orientation, and since the 
population being scanned for pathologies is likely to exhibit abnormalities or 
variations in his/her anatomical structures, the atlas model approach often results in 
mismatched registrations. Moreover, the atlas model is not effective in tracking 
lesions that completely appear or disappear from one image set to another. 
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Thus, it is desirable to provide systems and methods for registering images 
based on selective sampling that are insensitive to partial scans. It is yet further 
desirable to provide systems and methods for registering images that support temporal 
and cross-patient comparisons. Moreover, it is desirable to provide systems and 
methods for registering images that can reduce computation time through sampling or 
without always having to analyze an entire data set. On the other hand, it is desirable 
to provide systems and methods of registration that improve accuracy and resolution 
by using relevant information from image sets to compute the image registration or 
fusion. It is further desirable to provide systems and methods for registering images 
that provide input data for other computations such as volume and size measurements. 
It is also desirable that the systems and methods provided for registering images 
support various data acquisition systems, such as CT, PET or SPECT scanning and 
imaging. 

It is an object of the present invention to address and incorporate the above 
considerations. A further object of the present invention is to provide systems and 
methods for registering images based on selective sampling that are insensitive to 
patient anatomical abnormalities or variations. A further object of the present 
invention is to bring corresponding structures into aUgnment or otherwise align one 
image or sequence with a corresponding image or sequence. Additionally, it is also 
desirable to provide a system and method that can detect and track vanishing and 
appearing objects, such as lesions, within the context of the larger surrounding 
anatomy. The present invention provides a system and method that is accurate and 
displays high levels of physiological detail over the prior art without specially 
configured equipment. 

Summary of the Invention 

This invention employs a hierarchical method that is based on multi-scale 
computation and motion-tracking algorithms. In particular, a global similarity 
transformation is applied to align anatomical structures in a global manner. A local 
similarity transformation is then appMed to fine-tune and adjust internal details. 

Once patient scans have been obtained, the invention is capable of registering 
a first scan to a second scan in 2-D or 3-D. The systems and methods disclosed herein 
for mapping or autofusing one image to another image compensates for both extrinsic 
and intrinsic variations. In one aspect, a hierarchical registration algorithm is used to 
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address the two types of variations separately. The output of the registration provides 
the input necessary for other automated computations, such as volume and size 
comparisons and measurements. 

Brief Description of the Figures 

FIGURE 1 illustrates an exemplary position of a patient in a CT scanner in 
accordance with prior art; 

FIGURE 2 illustrates an exemplary CT volume of a patient obtained via a CT 
scanner in accordance with prior art; 

FIGURE 3 illustrates an exemplary axial slice taken with a CT scanner; 

FIGURE 4 illustrates an exemplary hierarchical registration algorithm in 
accordance with an embodiment of the invention; 

FIGURE 5 illustrates an exemplary global similarity transformation process in 
accordance with an embodiment of the invention; 

FIGURE 6 illustrates an exemplary process for determining optimal 
transformation and rotation parameters in accordance with an embodiment of the 
invention; 

FIGURE 7 illustrates an exemplary identifiable points and contours selection 
in thoracic images in accordance with an embodiment of the invention; and 

FIGURE 8 illustrates an exemplary local similarity transformation process in 
accordance with an embodiment of the invention. 



Detailed Descrit)tion of the Invention 

The present invention is a system and method for the mapping and image 
registration of 3-D image data volumes. Image volumes can be displayed on a 
graphical user interface ("GUI") to provide comparison information for medical 
diagnosis and physiological evaluation. The system and method can display various 
planar views and allows for highlighting ROIs and receiving user input regarding 
specific image data to be presented and selected. 

According to one system and method of the present invention, image 
registration inputs and outputs displayed on a GUI preferably include two 3-D image 
data volumes where one of the 3-D image data volumes is held fixed while voxels 
from the image data of a second set may be scaled, rotated, and translated to map 
anatomic features. Additionally, the GUI preferably allows for the selection and 
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update of various planar and volumetric images by inputting commands (for example, 
by dragging/clicking a cursor in a particular display window) with no delay apparent 
to the user. Additionally, data volumes may rotated, updated or selected with respect 
to fixed data. 

Image registration is characterized as the aligning and combining of multiple 
images of a region of interest in 2-D and/or 3-D. Image registration and comparison 
permits the derivation and extraction of diagnostic and physiological information 
about an ROI that would be unavailable from any one of an digital images simply in 
2-D or 3-D without further processing. 

The digital image sections to be processed, rendered, displayed or otherwise 
used includes digitized images acquired through any plane, including, without 
limitation, saggital, coronal and axial (or horizontal, transverse) planes and including 
planes at various angles to the saggital, coronal or axial planes. While the disclosure 
may refer to a particular plane or section, such as an axial section or plane, it is to be 
understood that any reference to a particular plane is not necessarily intended to be 
hmited to that particular plane, as the invention can apply to any plane an orientation 
acquired by any digital acquisition system. 

Figure 1 illustrates a position of a patient in a digital acquisition system, such 
as computer tomography (CT) scanner. Figure 2 illustrates an exemplary CT volume 
of the patient obtained via the CT scanner. A CT volume is a series of parallel cross- 
section images taken along a planar axis. Figure 3 illustrates an exemplary axial slice 
in the lungs taken with a CT scanner. For example, a typical CT axial image is 512 x 
512 array of 12-bit gray scale pixel values. Such an image has a spatial resolution of 
approximately 500 microns. Other digital acquisition techniques also yield images 
that may be stacked to create digitized volumetric data that may be used in accordance 
with the invention disclosed herein. 

The present invention is preferably performed on a computer system such as a 
Pentium™ -class personal computer running computer software that implements the 
algorithm of the present invention. The computer includes a processor, a memory and 
various input/output means. One or more digital images are accessible by the 
computer using conventional storage and input means. 

The algorithm disclosed herein automates the process of registering 
anatomical structures in digital images of the same or similar physiological structures. 
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The methods disclosed herein may be used for registering scans of the same patient 
over time or for cross-patient registration. The algorithm is capable of detecting and 
tracking vanishing and appearing objects, such as lesions, vv^ithin the context of the 
larger surrounding anatomy. 

A hierarchical algorithm is used to address global and local variations 
separately. A first part global similarity transformation is applied to align anatomical 
structures together in a gross manner. A second part local similarity transformation is 
appUed to fine-tune and adjust internal details. The hierarchical approach takes into 
account both extrinsic and intrinsic variations. The invention is automated and does 
not require manual input. The assumptions made are appropriate for the medical 
imaging domain. 

Automated ahgnment and registration performed by the algorithm disclosed 
herein achieves accurate and high-speed updates of the data image volumes. 
According to one aspect of the present invention, registration images can be displayed 
while at the same time displaying other views and/or data used for creating the 
registered image on a computer display. For example, the display could show one or 
more digital image slices used to create a registered image along with the registered 
image itself. Further teachings relating to displaying various 2-D and 3-D images over 
a GUI, means for receiving and responding to user inputs, marking and identifying 
ROIs, updating digital images for display, displaying temporal and cross-patient 
information and acquiring, accessing and providing detailed physiological 
measurement information is disclosed in our co-pending application "Graphical User 
Interface for Display of Anatomical Information", Serial No. 60/252,743, filed 
November 22, 2000 which has incorporated herein by reference. The image 
registration disclosed herein equally applies to any display or system for displaying 2- 
D and 3-D registration images. 

The image registration disclosed herein supports matching of image objects 
using a GUI and an automated method to compute the alignment and registration of 
the separate images without user interaction. These combined actions and 
optimization techniques and approaches according to the present invention are 
complementary and the alignment and registration that results is more accurate and 
computationally effective than prior methods. As will be further described and 
detailed herein, the system and method accomplish the image registration by the 
following general approach. First, two sets of image data scans are obtained. Next, 
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global and local similarity transformations are performed on the scans. Scaling, 
rotation and translation transformations are performed to account for global 
variations. Local similarity transformation processes are performed to compensate for 
intrinsic differences. Highly identifiable points or features points are identified for 
use as transformation reference markers. Transformation weighting is typically 
performed for pixels not related to feature points. 

More details on nodule detection in 3-D images can be found in the above- 
referenced applications "Pleural Nodule Detection from CT Thoracic Images," Serial 

No. (attorney docket number 8498-045-999), and "Density Nodule Detection 

in 3-Dimensional Medical Images," Serial No. (attorney docket number 

8498-035-999). 

Figure 4 illustrates an exemplary hierarchical registration algorithm and 
overview flowchart in accordance with an embodiment of the invention. New patient 
scan 401 and an old patient scan or a different patient scan 403 are collected at step 
402. Next, a global similarity transformation process is performed on scans 401 and 
403 (step 404). One global similarity transformation process approach is described 
below in relation to Figure 5. Next, a local similarity transformation process is 
performed on scans 401 and 403 at step 406. An exemplary local similarity 
transformation approach is described in relation to Figure 8 below. Finally, a 
transformed new patient scan (scan A) 405 and a different patient scan (scan B) 407 
are provided as the output (step 408). 

Global variations are addressed by a global similarity transformation. In one 
embodiment, the global similarity transformation is a so-called six-degree 
transformation that employs scaling, rotation, and a translation process to account for 
the global variations in 3-D. The transformation can also accommodate and employ 
2-D scaling and rotations. 

Figure 5 illustrates a global similarity transformation process in accordance 
with an embodiment of the invention. Scan pyramids are constructed for scans 401 
and 403 at step 502. A scan pyramid, or simply pyramid, is a sequence of copies of 
an image in which both sample density and resolution are decreased in successive 
levels. That is, adjacent levels of the pyramid proceed in increasing coarseness and 
indicate a level of subsampling performed. For example, the finest scale is the 
original image and the coarsest scale is the topmost level of the pyramid. As an 
illustration, the pyramids for the scan 401 and the scan 403 could have three levels. 
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To obtain a next level of a pyramid from a previous level, a filtering operator may be 
applied (e.g., a 3-D Gaussian or Gaussian-like kernel) followed by a sampling 
operator that typically samples every other pixel or voxel. Depending on desired 
efficiency and resolution, computations for the global similarity transformation can be 
done on sampled or sub-sampled scans. 

Using the constructed scan 401 pyramid and scan 403 pyramid, a set of 
optimal scale factors is determined (step 504). These scale factors refer to a level of 
magnification or demagnification relative to an image size to equahze, for example, 
zoom levels during image acquisition. Processes to determine a set of optimal scale 
factors will be described below. Once a set of optimal scale factors is determined, a 
uniform scaling is performed on scan 403 using the set of optimal scale factors to 
obtain a rescaled scan B (step 506). The scan 403 pyramid is reconstructed after the 
uniform scaling (step 508). Next, a set of optimal translation and rotation parameters 
is determined (step 5 10). One process for determining a set of optimal translation and 
rotation parameters is described in Figure 6. 

In one embodiment, a six-degree transformation is performed on rescaled scan 
403 based on the set of optimal translation and rotation parameters to obtain a 
transformed and rescaled scan B (step 512). Various six-degree transformation 
processes are known in the art and can be used for the systems and methods disclosed 
herein. The output of the various transformation processes is the transformed and 
rescaled scan B (step 514). 

In an exemplary embodiment, each scan 401 or scan 403 is a 3-dimensional 
image volume. The z-axis as used herein is defined as the axis perpendicular to scan 
sUces (i.e., the z-axis is ahgned along the route of the scanner). X- and y- axes are 
defined in a right-handed system anchored about the defined z-axis. All the voxels in 
the first volume (scan A) be defined as the set Ia, such that lA(x,y,z) specifies the 
intensity value of a particular voxel in the first volume with coordinates [x, y, z] with 
respect to an origin defined at one comer of the volume. For example, in CT thoracic 
scans, the origin is defined at the right, anterior, and superior comer of the body. All 
the voxels in the second volume (scan B) be defined as the set Ib in the same manner. 
New coordinates of a voxel after a transformation T operating on [x,y,z] in the first 
volume (scan A) are represented as TA[x,y,z]. 
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In a representative embodiment, a voxel is 500|j, x 500|i x (1000^ to 3000|i). 
A ROI is approximately 3000}x to 20,000|X in diameter and comprises up to 40 x 40 
voxels in the x- and y- directions and up to 20 voxels in the z- direction. 

In an exemplary embodiment, the scale factors are determined in the x, y, and 
5 z directions through division once the voxel size is known in both scans 401 and 403. 
For example, if the voxel size of scan 401 is (xa,ya,za) and the voxel size of scan B is 
(xb,yb,zb), then the scaling parameters are (xa/xb, ya/yb, za/zb). The scaling matrix 
(where all coordinates of the scans A 401 and B 403 have been homogenized and the 
origin translated to the center of the volume) is as follows: 

^000 

iri 0 — 00 

10 yb 

0 0—0 
0 0 0 1 

In some systems, the voxel size information is known. If the voxel size in 
both scans 401 and 403 is unknown, some segmentation of the scanned anatomical 
structure is performed. Segmentation techniques can include processes for separating 

1 5 irrelevant objects (for example, the background and foreground) or for segmenting 
(i.e., extracting) anatomical surfaces from images for the purposes of anatomical 
identification, diagnosis, evaluation, and volumetric measurements. The segmentation 
of image volume into distinctive anatomical regions and structures provides 
perspectives on the spatial relationships between such regions. Once anatomical 

20 regions and structures are segmented out, subsequent processing and analysis 
exploiting regional characteristics can be applied to relevant areas. From the 
segmentation, the anatomical structure's largest span in the x-, y-, and z- directions 
can be determined by various known methods. When the largest spans in the x-, y-, 
and z- directions are obtained for both scan A 401 (whose span in the x-, y-, and z- 

25 directions is identified as Xa, Ya, Za, respectively) and scan B 403 (whose span in the 
X-, y-, z- directions is identified as Xb, Yb, and Zb, respectively), the scaling matrix 
(where all coordinates of the scans A 401 and B 403 have been homogenized and the 
origin is translated to the center of the volume) is as follows: 

0 0 o" 

0 % 0 0 
0 0 % 0 
0 0 0 1 
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Typically, the scaling matrix is built on the assumption that there is a generally 
similar orientation of the anatomical structures in both scans. If such an assumption 
does not hold, the scaling steps 504-506 are skipped. Instead, an orientation effect is 
considered in the rotation/translation matrix, described below. In other words, scaling 
can be performed separately or as part of a rotation/translation step. 

Figure 6 illustrates one process for determining optimal translation and 
rotation parameters in accordance with an embodiment of the invention. Generally, 
optimal translation and rotation parameters are determined based on a selected 
number of roughly matching points in the scan 401 and the scan 403. The selection of 
roughly matching points depends on the type of scan and the structure being scanned. 
Typically, specific identifiable points on the boundary or identifiable contoxirs are 
selected as matching points. Identifiable points are locations in a structure (e.g., 
trachial bifurcations or specific bone formations) that are present in other 
similar/same anatomical structures and appear singularly in scans that are being 
analyzed. Identifiable contours are curves that can be parametrized and are capable of 
binding the anatomical structure in a singular descriptive way. Generally, a rough 
estimation of the boundaries of an anatomical structure is performed for the selection 
of rough matching points. At step 602, an anatomical strucuture in both scan 401 and 
scan 403 is segmented to obtain segmented anatomical maps. In an exemplary 
embodiment, the segmentation is performed to the extent necessary for selecting 
roughly matching points. For example, an anatomical segmentation is described in 

the "Lung Field Segmentation from CT Thoracic Images," Serial No. , 

(attorney docket number 8498-044-999) and incorporated by reference. 

Next, the segmented anatomical maps are used to reduce data size by 
subsampling both scans 401 and 403 (step 604). In an exemplary embodiment, 
corresponding, non planar points from the segmented anatomical maps of both scans 
401 and 403 are then identified (step 606). In an exemplary embodiment, for a given 
scan, at least four non-coplanar identifiable points, one contour that is non-coplanar 
with an identifiable point, or a minimum of two non-coplanar contours with no 
identifiable point are selected. In one embodiment, for every selected contour, three 
non-colinear points can be generated and used as identifiable points, as long as the 
non-colinear points were generated in the same manner as the parameterization of the 
contours in both scans. Thus, the scan 401 has a set, q, comprising four or more 
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identifiable points that corresponds to a set, p, comprising four or more identifiable 
points in the scan 403. 

Next, based on the identified points for each scan 401 or 403, an over- 
constrained system of equations is constructed and solved for the transformation 
matrix (step 608). For example, if P and Q are matrices containing the homogenized 
coordinates of the points in p and q, respectively, for n points contained in p and q, P 
and Q would each be an n-by-four matrix, where the first column contains all the x- 
coordinates of the points, the second column contains the y-coordinates, the third 
column contains the z-coordinates, and the fourth column contains the homogenized 
coordinate. If the set 

p = { (plx, ply, plz), (p2x, p2y, p2z),. .. (pnx, pny, pnz)}, the corresponding P matrix 
would be constructed as follows: 



plx ply plz 1 
plx p2y p2z 1 



Assuming that the points in P and Q are corresponding points in the two scans 
(which they ought to be by construction), the transformation matrix can be found by 
an over-constrained system of equations described by: 
P = M •Q 

where the raised dot expresses matrix multiplication and M is a four-by-four 
transformation matrix. 

An over-constrained system of equations does not always have a singular 
solution. In such a case, the solution that comes closest to satisfying the system is 
selected. A solution is closest when it minimizes the sum of squares of the difference 
between the right- and left- hand sides of P^M •Q. 

In an exemplary embodiment, a singular value decomposition of the Q matrix 
is used to solve the least squares problem as discussed in "Computer Methods for 
Mathematical Computations," Forsythe, G.E., Malcolm, M.A., and Moler, C.B. 1977, 
(Englewood Cliffs, NJ Prentice-Hall), Chapter 9 and in "Matrix Computations," 
Golub, G.H., and Van Loan, C.F. 1989, 2""^ Edition (Baltimore: Johns Hopkins 
University Press), §8.3 chapter 12, §5.2.6. The singular value decomposition (SVD) 
decomposes Q into three matrices: 
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where U is an n-by-four column-orthogonal matrix, w is a four-by-four 
diagonal matrix with Q's singular values in its diagonal (positive or zero), and V"^ is 
the transpose of V, which is a four-by-four orthogonal matrix. From this 
5 decomposition, the inverse matrix, Q \ of Q can be computed as follows: 



After Q"^ is obtained, M is solved by multiplying both sides of the equation: 
p = M»Q byQ^ 

F = M •e 



M is then used to transform the rescaled scan 403 into a rescaled and 
transformed scan 403 (step 610). Next, error is measured using sum of squared 
Mahalanobis distance between the transformed scan 403 and the scan 401 (step 612). 
In an exemplary embodiment, the sum of Mahalanobis distance between the 
intensities of corresponding voxels in the two scans is represented by the following 



equation: 



where G is a global transformation matrix. 

The Mahalanobis distance prevents biasing of a good transformation as a 
result of small intensity variations due to scanner noise. The Mahalanobis distance 
effectively classifies intensities before the intensities are compared. This approach is 
appropriate because two matching locations with the same tissue type may be scanned 
at slightly different intensities due to scanner differences that should not be 
considered an unacceptable error. The error that should be measured is an error 
resulting from an incorrect matching between completely different tissue types. In 



other embodiments, the sum of squared difference or root mean square error methods 
may be used instead. 

Next, the error is compared to a threshold error and a previous error (step 
614). If the error is less than the threshold error and the previous error, the process 
ends (step 616). If the error is greater than or equal to the threshold error or the 
previous error, the process repeats at step 606 until the error becomes acceptable. In 
an exemplary embodiment, the threshold error is predetermined by a user. 

In an exemplary embodiment, if the anatomical structure in a scan cannot be 
segmented, an iterative optimization algorithm described in "Hierarchical model- 
based motion estimation," Bergen et al. Proceedings of Second European Conference 
on Computer Vision, Springer-Verlag, p. 237-252, 1992, is used. The transformation 
parameters (three translation and three rotation) found by Bergen's method are then 
used to define M, the composite transformation matrix. 

In general, it is important that an appropriate segmentation approach is 
selected (e.g., step 602 above). Whether a segmentation approach is appropriate 
depends on the scanning methodology and anatomical structures being scanned. 
Some scanning methods lend themselves to simple thresholding. For example, 
computer tomography is a method in which the intensity of a voxel in a scan has a 
direct physical interpretation relative to the density of the tissue being scanned. In 
such a case, a computationally intensive segmentation procedure is typically not 
worthwhile when slightly inferior results can be computed using the smoothing-and- 
thresholding technique. Similarly, some anatomical structures have more easily 
identifiable points while others have more easily identifiable contours. For example, 
in the case of thoracic images, it is relatively simple to segment out the rib cages in 
the two scans and register the rib cages. But segmenting and registering the rib cages 
may not offer an accurate registration of the lungs since the lungs move independently 
inside the ribcage as the patient breathes. In an exemplary embodiment, the topmost 
points of both apices are selected as the identifiable points and the curves produced by 
the intersection of the diaphragm and the dorsal costal surfaces are selected as the 
identifiable contours for thoracic images. Figure 7 illustrates an exemplary point and 
contour selection for thoracic images in accordance with an embodiment of the 
invention. 

Due to intrinsic differences in the structures scanned, the global similarity 
transformation typically cannot achieve the degree of local accuracy desired. To 



compensate for intrinsic differences, an iterative motion-tracking algorithm for three- 
dimensional data set is preferably used to determine transformation parameters for 
local similarity transformations. In one aspect, employing motion-tracking algorithms 
has the advantage of not having to assume a rigid body transformation for an 
anatomical structure being registered such as with the atlas model. Figure 8 illustrates 
an exemplary local similarity transformation process in accordance with an 
embodiment of the invention. At step 802 a feature selection process is performed on 
the scan 401. The feature selection process ensures concentration on registering 
highly identifiable points that are hkely to correlate to other identifiable features in 
the scan 403. In a preferred embodiment, local similarity transformations are 
performed on the selected highly identifiable points (or feature points), not every 
point in the scan 401. In an exemplary embodiment, the feature points are selected 
using Shi-Tomasi's method. For more information, please see "An Iterative Image 
Registration Technique with an Application to Stereo Vision," Bruce D. Lucas and 
Takeo Kanade, Proceedings of the 7* International Joint Conference on Artificial 
Intelligence, 1981 [Lucas-Kanade] and "Good features to track," J. Shi and C. 
Tomasi, Proceedings of the IEEE Conference on Computer Vision and Pattern 
Recognition, pages 593-600, June 1994 [Shi-Tomasi]. These references are hereby 
incorporated for all purposes. After the feature points have been selected, for each 
selected feature point, a feature tracking process is performed (step 804). In an 
exemplary embodiment, local registration parameters are found using a 3-D iterative 
factorization technique. In an exemplary embodiment, the iterative factorization 
technique is as described by Lucas-Kanade for 2-D image registration in a stereo 
matching application as referenced above except adapted for 3-D images. The Lucas- 
Kanade technique was also described for finding optical flow fields in motion 
sequences in "A Paraperspective Factorization Method for Shaoe and Motion 
Recovery," Conrad J. Poelman and Takeo Kanade, Technical Report CMU-CS-92- 
208, Carnegie Mellon University, Pittsburgh, PA, October 1992. At step 804, features 
are accepted from two 3-D data sets to perform the feature tracking. 

The 3-D iterative factorization technique makes use of an error checking 
function such as a bilateral gradient similarity function. The bilateral gradient 
similarity function evaluates the similarity of the intensity isocontour at a feature 
point as compared to the surrounding intensity isocontour at a registered point. A 
measure of closeness can be determined by a threshold comparison set, for example, 
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via external visualization testing where an appropriate level of similarity is identified. 
The similarity measure is based on closeness in both magnitude and direction of the 
isocontour. High-magnitude isocontour vectors in the first scan (scan 401) (i.e., 
corresponding to high intensity variation at that point in the image) are determined to 
5 be similar to those in the neighborhood of the matching point in the second scan (scan 
403) when the points are similar in both magnitude and direction. This technique uses 
the direction of anatomical landmarks (e.g., vessels) to designate the "correctness" of 
the transformation. One advantage of using this technique, which relies on the 
intensity constancy assumption, is that intensity values generally have a designated 
10 meaning (e.g., tissue types) in medical imaging techniques (e.g., CT scanning). 

Additionally, the technique allows non-rigid transformations, which is a desired effect 
for local registration because the technique does not assume a single motion in the 
image. 

Next, for every point in the scan 401 that is not a "feature point", a 

15 transformation weighting is performed (step 806). This step allows the registration of 
items and objects that did not appear, or have vanished, to be registered with their 
context or environment as the items or objects would have been located. For every 
point that is not a feature point, in one approach a linear combination of local 
transformations for the feature points in its local neighborhood is performed. The 

20 weighting of the local transformations is relative to the feature points' distance to the 
point being registered. In one embodiment, the size of the neighborhood is 
predetermined by a user. For example, a neighborhood could be 50 x 50 x 50 pixels. 
In another embodiment, a default size of the neighborhood is hard-coded into 
memory. In the former embodiment, the size is usually determined empirically and is 

25 unique to the type of scan and the anatomical structure being scanned. The output of 
the transformation weighting is a scan map from the scan 401 to the transformed and 
rescaled scan 403, which was obtained as a result of the global similarity 
transformation process as described in Figure 5 above (step 808). 

It is noted that while the algorithm described herein is discussed in regards to 

30 3-D images, the algorithm is simplified and adapted for 2-D images. For 2-D images, 
the Lucas-Kanade technique need not be extended and can be used generally as-is. 

The foregoing examples illustrate certain exemplary embodiments of the 
invention from which other embodiments, variations, and modifications will be 



CAl -.269321 



17 



8498-043-999 



apparent to those skilled in the art. The invention should therefore not be limited to 
the particular embodiments discussed above, but rather is defined by the claims. 
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